Monocular depth estimation method based on pyramid split attention network
Wenju LI, Mengying LI, Liu CUI, Wanghui CHU, Yi ZHANG, Hui GAO
Journal of Computer Applications    2023, 43 (6): 1736-1742.   DOI: 10.11772/j.issn.1001-9081.2022060852

To address inaccurate prediction of edges and of the farthest region in monocular image depth estimation, a monocular depth estimation method based on the Pyramid Split attention Network (PS-Net) was proposed. Firstly, building on the Boundary-induced and Scene-aggregated Network (BS-Net), a Pyramid Split Attention (PSA) module was introduced in PS-Net to process the spatial information of multi-scale features and to establish long-range dependencies between multi-scale channel attentions, thereby extracting boundaries where the depth gradient changes sharply as well as the farthest region. Then, the Mish function was used as the activation function in the decoder to further improve the performance of the network. Finally, training and evaluation were performed on the NYUD v2 (New York University Depth dataset v2) and iBims-1 (independent Benchmark images and matched scans v1) datasets. Experimental results on the iBims-1 dataset show that, compared with BS-Net, the proposed network reduces the Directed Depth Error (DDE) by 1.42 percentage points, and that the proportion of correctly predicted depth pixels reaches 81.69%. These results indicate that the proposed network predicts depth with high accuracy.
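
For readers unfamiliar with the Mish activation mentioned above, the following is a minimal sketch of its commonly published definition, mish(x) = x * tanh(softplus(x)). It is an illustration only and is not taken from the paper's code; the function names `softplus` and `mish` are chosen here for the example, and the scalar pure-Python form is an assumption made for readability (a real decoder would apply it element-wise to tensors).

```python
import math


def softplus(x: float) -> float:
    """Numerically stable ln(1 + e^x)."""
    # max(x, 0) + log1p(exp(-|x|)) avoids overflow for large positive x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x)) (illustrative scalar version)."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for value in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"mish({value:+.1f}) = {mish(value):+.4f}")
```

Unlike ReLU, this function is smooth and lets small negative values pass through, which is the usual motivation given for using it.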

Lightweight traffic sign recognition model based on coordinate attention
Wenju LI, Gan ZHANG, Liu CUI, Wanghui CHU
Journal of Computer Applications    2023, 43 (2): 608-614.   DOI: 10.11772/j.issn.1001-9081.2022010100

To address the imbalance between detection speed and recognition accuracy in traffic sign recognition models, and the difficulty of detecting occluded and small targets, the YOLOv5 (You Only Look Once version 5) model was improved and a lightweight traffic sign recognition model based on Coordinate Attention (CA) was proposed. Firstly, the CA mechanism was integrated into the backbone network to capture the relationships between location information and channels, so that regions of interest are obtained more accurately without excessive computational overhead. Then, cross-layer connections were added to the feature fusion network to fuse more feature information without increasing the cost, improving both the feature extraction ability of the network and the detection of occluded targets. Finally, an improved CIoU (Complete Intersection over Union) function was introduced to calculate the localization loss, thereby alleviating the uneven distribution of sample sizes in the detection process and further improving the recognition accuracy of small targets. On the TT100K (Tsinghua-Tencent 100K) dataset, the model achieves a recognition accuracy of 91.5% and a recall of 86.64%, improvements of 20.96% and 11.62% respectively over the traditional YOLOv5n model, with a frame processing rate of 140.84 FPS (Frames Per Second). These experimental results verify the accuracy and real-time performance of the proposed model for traffic sign detection and recognition in real scenes.
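
As background for the localization loss mentioned above, the following is a minimal sketch of the standard CIoU loss as commonly published, not the improved variant proposed in the paper, whose details the abstract does not give. Boxes are assumed to be `(x1, y1, x2, y2)` corner tuples; the function name `ciou_loss` and the small `eps` constant are choices made for this example.

```python
import math

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def ciou_loss(pred: Box, target: Box, eps: float = 1e-9) -> float:
    """Standard CIoU loss: 1 - IoU + rho^2 / c^2 + alpha * v."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Squared distance between box centers (rho^2).
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
        + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both (c^2).
    c2 = (max(px2, tx2) - min(px1, tx1)) ** 2 \
        + (max(py2, ty2) - min(py1, ty1)) ** 2

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(tw / (th + eps))
                              - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / (c2 + eps) + alpha * v


if __name__ == "__main__":
    print(ciou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

The center-distance term keeps the loss informative even when the boxes do not overlap and the IoU is zero, a case where a plain IoU loss would be flat.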
